Bioprocess data mining using regularized regression and random forests
Similar resources
Mining bioprocess data: opportunities and challenges.
Modern biotechnology production plants are equipped with sophisticated control, data logging and archiving systems. These data hold a wealth of information that might shed light on the cause of process outcome fluctuations, whether the outcome of concern is productivity or product quality. These data might also provide clues on means to further improve process outcome. Data-driven knowledge dis...
Insights of Data Mining for Small and Unbalanced Data Set Using Random Forests
Because random forests are generated with random selection of attributes and use samples that are drawn by bootstrapping, they are good for data sets that have relatively many attributes and a small number of training instances. In this paper an efficient procedure that considers the property of data set having many attributes with relatively small number of attributes in arrhythmia is investigated...
Pathway analysis using random forests classification and regression
MOTIVATION: Although numerous methods have been developed to better capture biological information from microarray data, commonly used single gene-based methods neglect interactions among genes and leave room for other novel approaches. For example, most classification and regression methods for microarray data are based on the whole set of genes and have not made use of pathway information. Pat...
Exploratory Data Analysis using Random Forests
Although the rise of "big data" has made machine learning algorithms more visible and relevant for social scientists, they are still widely considered to be "black box" models that are not well suited for substantive research: only prediction. We argue that this need not be the case, and present one method, Random Forests, with an emphasis on its practical application for exploratory analysis a...
Performance of random forests and logic regression methods using mini-exome sequence data
Machine learning approaches are an attractive option for analyzing large-scale data to detect genetic variants that contribute to variation of a quantitative trait, without requiring specific distributional assumptions. We evaluate two machine learning methods, random forests and logic regression, and compare them to standard simple univariate linear regression, using the Genetic Analysis Works...
Journal
Journal title: BMC Systems Biology
Year: 2013
ISSN: 1752-0509
DOI: 10.1186/1752-0509-7-s1-s5